Speech Enhancement Using Gaussian Mixture Models, Explicit Bayesian Estimation and Wiener Filtering

Authors

M. H. Savoji, Shahid Beheshti University

S. Chehrehsa, Auckland, New Zealand

Abstract:

Gaussian Mixture Models (GMMs) of the power spectral densities of speech and noise are used with explicit Bayesian estimation in Wiener filtering of noisy speech. No assumption is made about the nature or stationarity of the noise, and neither voice activity detection (VAD) nor any other means is employed to estimate the input SNR. The GMM mean vectors are used to form sets of over-determined systems of equations whose solutions yield first estimates of the speech and noise power spectra. The noise source is also identified, and the input SNR estimated, in this first step. These first estimates are then refined using approximate but explicit MMSE and MAP estimation formulations, and the refined estimates are used in a Wiener filter to reduce noise and enhance the noisy speech. Both proposed schemes show good results. However, the explicit MAP solution, introduced here for the first time, reduces the computation time to less than one third, with a slightly higher improvement in SNR and PESQ score and less distortion than the MMSE solution.
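The first-estimate and filtering steps can be illustrated with a minimal sketch. The abstract does not give the paper's actual equations, so everything below is an assumption made for illustration: the noisy power spectrum is modelled as a scaled speech mean plus a scaled noise mean, each (speech, noise) pair of GMM mean vectors gives one over-determined system (one equation per frequency bin, two unknown scale factors), and the pair with the smallest residual is kept. The function names `first_estimate`, `wiener_gain` and `snr_db` are hypothetical, and the MMSE/MAP refinement stage is not shown.

```python
import numpy as np

def first_estimate(noisy_psd, speech_means, noise_means):
    """Pick the (speech, noise) mean pair that best explains noisy_psd.

    Assumed model (not taken from the paper): noisy_psd ~ a * s + b * n,
    solved by least squares for each pair of mean vectors.
    """
    best = None
    for k, s in enumerate(speech_means):
        for m, n in enumerate(noise_means):
            A = np.column_stack([s, n])  # F equations, 2 unknowns
            gains, *_ = np.linalg.lstsq(A, noisy_psd, rcond=None)
            # Crude non-negativity: clip instead of a true NNLS solve.
            gains = np.clip(gains, 0.0, None)
            err = float(np.sum((A @ gains - noisy_psd) ** 2))
            if best is None or err < best[0]:
                best = (err, k, m, gains[0] * s, gains[1] * n)
    _, k, m, speech_psd, noise_psd = best
    return k, m, speech_psd, noise_psd

def wiener_gain(speech_psd, noise_psd, eps=1e-12):
    """Per-bin Wiener gain S / (S + N); eps guards against division by zero."""
    return speech_psd / (speech_psd + noise_psd + eps)

def snr_db(speech_psd, noise_psd, eps=1e-12):
    """Input SNR in dB from the estimated speech and noise powers."""
    return 10.0 * np.log10((np.sum(speech_psd) + eps) / (np.sum(noise_psd) + eps))

# Toy data: 2 speech means and 2 noise means over 4 frequency bins.
speech_means = np.array([[4.0, 2.0, 1.0, 0.5], [1.0, 1.0, 4.0, 4.0]])
noise_means = np.array([[1.0, 1.0, 1.0, 1.0], [0.1, 0.5, 2.0, 3.0]])
noisy_psd = 2.0 * speech_means[0] + 0.5 * noise_means[0]

k, m, speech_psd, noise_psd = first_estimate(noisy_psd, speech_means, noise_means)
gain = wiener_gain(speech_psd, noise_psd)
enhanced_psd = gain ** 2 * noisy_psd  # gain applies to amplitude, so square it for power
```

In this toy case the selected noise index `m` plays the role of the identified noise source, and `snr_db(speech_psd, noise_psd)` gives the input SNR estimate without any VAD. A real system would apply the gain to the short-time spectrum of each frame and resynthesize the signal.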


Similar resources

Speech quality estimation using Gaussian mixture models

We propose a novel method to estimate the quality of coded speech signals. The joint probability distribution of the subjective mean opinion score (MOS) and perceptual distortion feature variables is modelled using a Gaussian mixture density. The feature variables are sifted from a large pool of candidate features using statistical data mining techniques. We study what combinations of features ...


Objective Speech Quality Estimation using Gaussian Mixture Models

In this thesis, we propose the use of Gaussian mixture models (GMMs) as simple, yet effective predictors of perceived speech quality. A large pool of perceptual distortion features is extracted from speech files. Initially, statistical data mining algorithms are used to sift out the most relevant variables from the pool. We show that the five most salient feature variables are sufficient to con...


A Bayesian Filtering Algorithm for Gaussian Mixture Models

A Bayesian filtering algorithm is developed for a class of state-space systems that can be modelled via Gaussian mixtures. In general, the exact solution to this filtering problem involves an exponential growth in the number of mixture terms and this is handled here by utilising a Gaussian mixture reduction step after both the time and measurement updates. In addition, a square-root implementat...


Speech enhancement based on hypothesized Wiener filtering

We propose a novel speech enhancement technique based on the hypothesized Wiener filter (HWF) methodology. The proposed HWF algorithm selects a filter for enhancing the input noisy signal by first ‘hypothesizing’ a set of filters and then choosing the most appropriate one for the actual filtering. We show that the proposed HWF can intrinsically offer superior performance to conventional Wiener ...


Explicit segmentation of speech using Gaussian models

In this paper we investigate an automatic method to segment labeled speech. The method needs an initial estimation of the segmentation which is provided by an alignment based on HMM. Afterwards, the boundaries are refined moving the frontier frames to the segment which is more similar to the speech frame. Gaussian pdf are used as a similarity measure. The performance of the method is evaluated ...


Subspace based speech enhancement using Gaussian mixture model

Traditional subspace based speech enhancement (SSE) methods use linear minimum mean square error (LMMSE) estimation that is optimal if the Karhunen Loeve transform (KLT) coefficients of speech and noise are Gaussian distributed. In this paper, we investigate the use of Gaussian mixture (GM) density for modeling the non-Gaussian statistics of the clean speech KLT coefficients. Using Gaussian mix...



Journal title

Iranian Journal of Electrical and Electronic Engineering

Volume 10, Issue 3

Pages 168-175

Publication date 2014-09
